Finite Query Languages for Sequence Databases
Abstract
This paper develops a query language for sequence databases, such as genome databases and text databases. Unlike relational data, queries over sequential data can easily produce infinite answer sets, since the universe of sequences is infinite, even for a finite alphabet. The challenge is to develop query languages that are both highly expressive and finite. This paper develops such a language as a subset of a recently developed logic called Sequence Datalog [19]. Sequence Datalog distinguishes syntactically between subsequence extraction and sequence construction. Extraction creates sequences of bounded length and leads to safe recursion, while construction can create sequences of arbitrary length and leads to unsafe recursion. In this paper, we develop syntactic restrictions for Sequence Datalog that allow sequence construction but preserve finiteness. The main idea is to use safe recursion to control and limit unsafe recursion. The main results are the definition of a finite form of recursion, called domain bounded recursion, and a characterization of its complexity and expressive power. Although finite, the resulting class of programs is highly expressive, since its data complexity is complete for the elementary functions.
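The contrast between extraction and construction can be illustrated with a minimal Python sketch. It is not Sequence Datalog and does not implement domain bounded recursion; the function names (`subsequences`, `extraction_closure`, `construction_steps`) and the sample database are assumptions made purely for illustration. Extraction is modeled as taking contiguous subsequences, and construction as concatenating a sequence with itself.

```python
def subsequences(seq):
    """Return all contiguous subsequences of seq, including the empty one."""
    return {seq[i:j] for i in range(len(seq) + 1) for j in range(i, len(seq) + 1)}


def extraction_closure(database):
    """Collect every sequence reachable by extraction alone.

    No result is longer than the longest input sequence, so the set is finite.
    """
    result = set()
    for seq in database:
        result |= subsequences(seq)
    return result


def construction_steps(database, steps):
    """Apply a hypothetical constructive rule (s -> s + s) a fixed number of times.

    Without the explicit bound on steps, the recursion would never terminate.
    """
    current = set(database)
    for _ in range(steps):
        current |= {s + s for s in current}
    return current


db = {"acgt", "ga"}  # hypothetical sequence database

extracted = extraction_closure(db)
max_len_extracted = max(len(s) for s in extracted)

# Longest sequence after 0, 1, 2, and 3 rounds of construction.
max_len_constructed = [
    max(len(s) for s in construction_steps(db, k)) for k in range(4)
]

print(max_len_extracted)    # 4
print(max_len_constructed)  # [4, 8, 16, 32]
```

Repeating extraction adds nothing new once every subsequence has been collected, whereas each round of construction doubles the length of the longest sequence. The explicit `steps` bound here only keeps the sketch terminating; the paper achieves finiteness through syntactic restrictions instead.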
Similar resources
Selecting the most suitable query language for using outer joins to extract data in Datalog mode in the DES deductive database system
Deductive database systems are designed around a logical data model. In a deductive database system, data are saved as facts, as opposed to a Relational Database Management System (RDBMS), in which data are stored in tables. The Datalog Educational System (DES) is a deductive database system in which Datalog mode is the default mode. It can extract data using outer joins with three query la...
Formal Languages and Algorithms for Similarity Based Retrieval from Sequence Databases
The paper considers various formalisms based on automata, temporal logic, and regular expressions for specifying queries over finite sequences. Unlike traditional semantics that associate a true or false value denoting whether a sequence satisfies a query, the paper presents distance measures that associate a value in the interval [0, 1] with a sequence and a query, denoting how closely the seque...
Sequence Datalog: Declarative String Manipulation in Databases
We investigate logic-based query languages for sequence databases, that is, databases in which strings of symbols over a fixed alphabet can occur. We discuss different approaches to querying strings, including Prolog and Datalog with function symbols, and argue that all of them have important limitations. We then present the semantics of Sequence Datalog, a logic for querying sequence databases, ...
On the finite controllability of conjunctive query answering in databases under open-world assumption
In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries ...
Query Languages for Sequence Databases: Termination and Complexity
This paper develops a query language for sequence databases, such as genome databases and text databases. Unlike relational data, queries over sequential data can easily produce infinite answer sets, since the universe of sequences is infinite, even for a finite alphabet. The challenge is to develop query languages that are both highly expressive and finite. This paper develops such a language as a s...
Publication date: 1995